ABSTRACT

Traditionally, recommender systems for the Web deal with
applications that have two dimensions, users and items. Based
on access logs that relate these dimensions, a recommendation
model can be built and used to identify a set of N items that
will be of interest to a certain user. In this paper we propose
a method to complement the information in the access logs with
contextual information without changing the recommendation
algorithm. The method consists in representing context as
virtual items. We empirically test this method with two top-N
recommender systems, an item-based collaborative filtering
technique and association rules, on three data sets. The results
show that our method is able to take advantage of the context
(new dimensions) when it is informative.

I.2.6 [Artificial Intelligence]: Learning - Induction

1. INTRODUCTION

Most Web sites offer a large number of information resources
to their users. Finding relevant content has thus become a
challenge for users. Recommender systems have emerged in
response to this problem. A recommender system for a Web site
receives (implicit or explicit) information about users and
their behavior and recommends items that are likely to fit
their needs [12].

Recommender models for Web personalization can be built from
the historical record of accesses to a site, where one access
is a pair <user_id, item>. Each access is interpreted as a
rating of 1 given by the user to the item. However, other
dimensions, such as time and location, can add contextual
information and improve the accuracy of recommendations. For
instance, the type of books that a user looks for on Amazon
during work hours is probably different from the books searched
for during leisure hours.

According to [11], the idea that contextual information is
important when predicting customer behavior is not new. Many
Web sites are supported by Content Management Systems (CMS),
which often store much contextual information. However, this is
not true in all cases and, additionally, getting information
that is really relevant for recommendation is a hard task in
many applications. Adomavicius et al. [1] have investigated the
use of context for rating estimation in multidimensional
recommender systems. Palmisano et al. [11] have used contextual
information to improve the predictive modeling of customers'
behavior. Both groups of authors have developed a
special-purpose browser to obtain rich contextual information.

In this paper we explore how contextual information can be
used to improve the accuracy of Top-N Recommender Systems.
Existing contextual recommender systems typically use
contextual information as a label for segmenting/filtering
sessions, and use the resulting segments to build the
recommendation model (e.g., [1][11]). We follow an alternative
approach, which uses the contextual attribute as a virtual
item. This means that it is treated as an ordinary item for
building the recommendation model, which has the advantage of
allowing the use of existing recommendation algorithms. As our
contextual information is obtained from multidimensional data,
we have called our approach DaVI (Dimensions as Virtual Items).
Instead of using a special-purpose browser [1][11], we collect
the multidimensional data from Web access logs and from
attributes stored in databases of the Web sites. We have
empirically tested our approach with two recommendation
techniques, item-based collaborative filtering and association
rules, to assess the effect of adding context on the accuracy
of traditional Web recommender systems. We present results
obtained on three data sets.

In the following section, we present the contextual infor- 
mation used in our experiments. Next, we describe the rec- 
ommendation techniques and the approach proposed. Then, 
we discuss results and present conclusions and future work. 

2. CONTEXTUAL INFORMATION

There are many definitions of context in the literature,
depending on the field of application and the available
customer data [11]. In this paper, context is defined as any
information that can be used to characterize the situation of
an entity [5]. Here an entity is an access to an item/Web page
by a user.

A critical issue is how to obtain rich contextual information.
In some circumstances, context is explicit, such as a person
informing a movie recommender system where he/she wants to
watch a movie. On the other hand, the contextual information
can also be inferred from Web access data. For example, we can
observe from the Web access logs whether a person bought an
item from an e-commerce Web site on a weekday or at the
weekend.

Besides general contextual information that can be obtained
from access logs, we may use domain-specific information, which
is typically collected from the CMS. For example, if an item
represents an access to a song, the genre of the song can be
used as a dimension of contextual information.

In Table 1 we present the dimensions/contextual information
considered in the experiments presented in this paper. The
first group of contextual information was obtained by
pre-processing Web access logs. The second group was collected
from the CMS of a Web site of Portuguese Music^1 used in this
study. The last group refers to a public data set^2 that
contains a record of user interactions with the Entree Chicago
restaurant recommender system. All the information is stored in
a data warehouse that was specifically designed for modeling
Web sites.



Table 1: Contextual information

Context        Description

day            Day of each access (from 01 to 31).
month          Month of each access (from 01 to 12).
week_day       Week day of each access (from Monday to Sunday).
work_day       Whether the access was made during the week (from
               Monday to Friday) or at the weekend (Saturday or
               Sunday).
hour           Hour of each access (from 01 to 24).
work_hour      Whether the access was made during working time
               (from 8 a.m. to 6 p.m.) or not.
location       Location where the access was made (country).

music_genre    The genre of a song. There are 45 different musical
               genres, for instance, pop, rock, jazz, and so forth.
band           The band that plays a song. There are 2296 different
               bands in our music recommendation data sets.
instrumental   Whether a song is instrumental or not.

intention      The intention of navigation in a restaurant
               recommendation system (for example, the search for
               a restaurant that is cheaper, closer, more
               traditional, more creative, and so forth). There
               are 9 different intentions of navigation in our
               experiments.




3. RECOMMENDER SYSTEMS 

A recommender system for the Web typically outputs an ordered
list of recommendations, given a trail of recent Web page
requests. Historical information about the behavior of the
users of the site and the current session are used to suggest
certain pages or services, or even the purchase of certain
products [12]. In the context of the Web, a session can be
abstracted to a set of pairs <user_id, item>, recorded at
moments close in time, with the same user_id.

^1 http://www.palcoprincipal.pt.
^2 http://archive.ics.uci.edu/ml/datasets/Entree+Chicago+Recommendation+Data.

Usually a recommender system is divided into a two-stage
process [3]. The first stage is carried out offline. Data
representing the behavior of users of the Web site, which were
previously collected, are mined and a model is generated for
use in future online interactions. The second stage is carried
out in real time with a new user interacting with the Web site.
Data from the current user session are used as input by the
model to generate a list of N recommendations. A number of
algorithms have been used for offline model building [3],
including Collaborative Filtering, Item-Based Collaborative
Filtering, Association Rules and Markov Models. In this section
we briefly describe the two algorithms used in this work,
Item-Based Collaborative Filtering and Association Rules, and
how we have applied DaVI (Dimensions as Virtual Items) to these
algorithms.

3.1 Item-Based Collaborative Filtering 

Item-based collaborative filtering (CF) analyzes stored
accesses (grouped in sessions) to identify relations between
the items in the set I, which contains all items of a Web
site [10]. The recommendation model is a matrix representing
the similarities between all pairs of items, according to a
chosen similarity measure. An abstract representation of a
similarity matrix is shown in Table 2.

Table 2: Item-item similarity matrix.

In Table 2, each item $i \in I$ is an accessed page. The
similarity measure used here is the cosine of the angle between
two item vectors, defined by

$$sim(i_{k_1}, i_{k_2}) = \cos(i_{k_1}, i_{k_2}) = \frac{i_{k_1} \cdot i_{k_2}}{\|i_{k_1}\| \, \|i_{k_2}\|}$$

where $i_{k_1}$ and $i_{k_2}$ are binary vectors with as many
positions as existing users. The value 1 means that the user
accessed the respective item/page. The value 0 means the
opposite. The "$\cdot$" denotes the dot product of the two
vectors.

Given a user who accessed the set of items $O \subset I$, the
model generates a recommendation by selecting the N items that
are the most similar to the items in the set O. Here, the
similarity for each item $i \notin O$ is given by the weighted
average of its nearest neighbors with respect to their presence
in the set O.
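
The following Python sketch illustrates the model just described
on toy data. It is not the implementation used in the
experiments: the function names, the neighborhood size k and the
tie-breaking by item name are assumptions made for illustration.
Because the item vectors are binary, the dot product reduces to
the number of users shared by two items and each norm to the
square root of an item's user count.

```python
import math
from collections import defaultdict


def cosine_similarities(sessions):
    """sessions: dict user_id -> set of items.

    Returns (sim, items), where sim maps every ordered pair of
    distinct items to the cosine of their binary user vectors.
    """
    users_of = defaultdict(set)
    for user, accessed in sessions.items():
        for item in accessed:
            users_of[item].add(user)
    items = sorted(users_of)
    sim = {}
    for a in items:
        for b in items:
            if a != b:
                shared = len(users_of[a] & users_of[b])
                norm = math.sqrt(len(users_of[a]) * len(users_of[b]))
                sim[(a, b)] = shared / norm
    return sim, items


def recommend(sim, items, observed, n, k=5):
    """Top-n items not in `observed`.

    Each candidate is scored by the similarity-weighted share of
    its k nearest neighbors that are present in `observed`
    (k is a hypothetical parameter, not taken from the paper).
    """
    scores = {}
    for i in items:
        if i in observed:
            continue
        others = [j for j in items if j != i]
        neighbors = sorted(others, key=lambda j: sim[(i, j)], reverse=True)[:k]
        total = sum(sim[(i, j)] for j in neighbors)
        if total > 0:
            in_observed = sum(sim[(i, j)] for j in neighbors if j in observed)
            scores[i] = in_observed / total
    return sorted(scores, key=lambda i: (-scores[i], i))[:n]
```

For example, with sessions {"u1": {"a", "b"}, "u2": {"a", "b"},
"u3": {"b", "c"}, "u4": {"c"}} and the observable set {"a"},
item b scores higher than item c because b shares two users
with a while c shares none.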

Table 3: Similarity matrix with the contextual information day.

When we apply DaVI to the item-based collaborative filtering
algorithm, it treats the contextual attributes as new items
(virtual items) in the data set. This means that it adds a new
row and column for each different value of the context to the
former similarity matrix and calculates the corresponding
similarity values, between the values of the context and the
other items, as presented previously. A representation of a
similarity matrix with contextual information
$day = \{d_1, d_2, \ldots, d_v\}$ is shown in Table 3. Here, an
item can be a page or a possible value for the context day (1
to 31). Although the contextual information is used in the
models, only pages are recommended. The recommendations will be
the set of pages that are most similar to a given set of
observable items $O \subset \{I \cup day\}$. The rationale
behind this approach is that the similarity between a given
item and a given day (for example) is higher if the item tends
to be accessed on that day of the month. This way, the relation
between items and the context is captured. When a
recommendation is made for an active session, the value of the
context in that particular session (e.g., the day of the month
on which the active session is taking place) is used to provide
the contextual information.
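
As a rough illustration of this idea, the Python sketch below
adds one virtual item per session for the dimension day,
computes cosine similarities over the augmented sessions, and
recommends only real pages. The "day=" prefix, the function
names and the scoring by summed similarity to the observables
are simplifying assumptions for the example, not the exact
procedure used in the experiments.

```python
import math
from collections import defaultdict

VIRTUAL_PREFIX = "day="  # hypothetical naming convention for virtual items


def add_virtual_items(sessions, day_of):
    """Return a copy of the sessions with one virtual item per session.

    sessions: dict session_id -> set of pages
    day_of:   dict session_id -> day of the month of that session
    """
    return {
        sid: set(pages) | {f"{VIRTUAL_PREFIX}{day_of[sid]}"}
        for sid, pages in sessions.items()
    }


def item_similarities(sessions):
    """Cosine similarity between binary occurrence vectors of all items,
    virtual items included (they are treated as ordinary items)."""
    occ = defaultdict(set)
    for sid, items in sessions.items():
        for item in items:
            occ[item].add(sid)
    return {
        (a, b): len(occ[a] & occ[b]) / math.sqrt(len(occ[a]) * len(occ[b]))
        for a in occ
        for b in occ
        if a != b
    }


def recommend_pages(sim, observed, n):
    """Top-n real pages; virtual items may be observed but never recommended."""
    candidates = {a for a, _ in sim}
    scores = {}
    for c in candidates:
        if c in observed or c.startswith(VIRTUAL_PREFIX):
            continue
        # Simplified score: summed similarity to every observable item.
        scores[c] = sum(sim.get((c, o), 0.0) for o in observed)
    return sorted(scores, key=lambda c: (-scores[c], c))[:n]
```

In a toy data set where page p1 is only accessed on day 1 and
page p2 only on day 2, an active session containing the same
real pages is steered toward p1 or p2 depending on which
virtual day item is placed in its set of observables.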

3.2 Association Rules

A recommendation model M based on association rules (AR) is a
set of rules R, each of them with the form $A \Rightarrow B$,
where A and B are sets of items. Each AR is characterized by
its support and confidence [2]. The model is generated from a
set of Web sessions, each consisting of a set of pairs
<id, item> with the same id, where id and item identify the
user and the accessed page. Given a set of observable items O,
the set of rules R is used to recommend a set of items/pages
Recs, as follows:

$$Recs = \{consequent(r_i) \mid r_i \in M \text{ and } antecedent(r_i) \subseteq O \text{ and } consequent(r_i) \notin O\}.$$

To obtain the top N recommendations, we select from Recs the
distinct recommendations corresponding to the rules with the
highest confidence. In our work we use the Caren^3 association
rules generator.
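
The selection of Recs and of the top N recommendations can be
sketched in Python as follows. This is an illustrative sketch,
not the Caren implementation: the Rule type, the restriction to
single-item consequents and the tie-breaking by item name are
assumptions made for the example.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    antecedent: frozenset  # set A of the rule A => B
    consequent: str        # assumption: a single item on the right-hand side
    confidence: float


def recommend_from_rules(rules, observed, n):
    """Top-n distinct consequents of the rules that fire on `observed`.

    A rule fires when its antecedent is a subset of the observables
    and its consequent has not been observed yet. Each distinct
    consequent keeps the highest confidence among its firing rules.
    """
    observed = set(observed)
    best = {}
    for rule in rules:
        if rule.antecedent <= observed and rule.consequent not in observed:
            previous = best.get(rule.consequent, 0.0)
            best[rule.consequent] = max(previous, rule.confidence)
    return sorted(best, key=lambda item: (-best[item], item))[:n]
```

Keeping only the best confidence per consequent is one way to
obtain distinct recommendations when several rules point to the
same item.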

Extending AR to handle contextual information by applying DaVI
simply consists of including extra <id, item> pairs in the
former set of sessions. For example, to use the dimension day,
we add a pair <id, day-value> to the respective session with
tag id, where day-value represents the day of the month when
the session id occurred. The set of augmented sessions is used
as input to the recommendation algorithm. The rules built will
include both actual items and virtual items in the antecedent
and only actual items in the consequent. Given an active
session occurring on day x, the set of observables O includes
the items in the active session and the virtual items (e.g.,
day = x).
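
A minimal Python sketch of this augmentation is given below. To
keep it short, it mines only rules with a single item in the
antecedent, which is a simplification and not how a full
association rule generator such as Caren works. Representing a
virtual item as a ("day", value) tuple, and the function names,
are assumptions made for the example.

```python
from collections import Counter
from itertools import permutations


def augment(sessions, day_of):
    """Add the pair <id, day-value> to each session as a virtual item.

    Real items are strings; virtual items are ("day", value) tuples.
    """
    return {
        sid: set(items) | {("day", day_of[sid])}
        for sid, items in sessions.items()
    }


def mine_pair_rules(sessions, min_support, min_confidence):
    """Mine rules {a} => {b} from the sessions.

    Virtual items are allowed in the antecedent only, so that
    nothing but actual items is ever recommended.
    Returns a list of (antecedent, consequent, support, confidence).
    """
    n_sessions = len(sessions)
    item_count = Counter()
    pair_count = Counter()
    for items in sessions.values():
        item_count.update(items)
        pair_count.update(permutations(items, 2))
    rules = []
    for (a, b), count in pair_count.items():
        if isinstance(b, tuple):  # skip rules with a virtual consequent
            continue
        support = count / n_sessions
        confidence = count / item_count[a]
        if support >= min_support and confidence >= min_confidence:
            rules.append((a, b, support, confidence))
    return rules
```

Note that the mining function itself is unaware of context: the
virtual items reach it only through the augmented sessions, and
the single extra check filters them out of the consequents.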

Notice that DaVI does not modify the recommendation 
algorithms. It just inserts the contextual information as 
virtual items in the data sets. Thus we can easily extend 
DaVI to other recommendation methods. 

4. EMPIRICAL EVALUATION 

In this section we evaluate how DaVI can improve the accuracy
of the recommendation algorithms presented in Sections 3.1 and
3.2.

^3 http://www.di.uminho.pt/~pja/class/caren.html.



4.1 Experimental Setup 

The evaluation is carried out on three different data sets
(Table 4). The Listener data set contains accesses to songs in
the music Web site mentioned earlier. The Playlist data set
represents the set of songs explicitly selected by users of the
same site for their individual playlists. Entree is a public
data set that contains a record of user interactions with the
Entree Chicago restaurant recommender system.



Table 4: Characteristics of the data sets

Data sets   # Accesses   # Items   # Users
Listener    62208        6428      9740
Playlist    37022        5428      4417
Entree      149849       639       31440


To measure the accuracy of the recommender systems we use the
All But One protocol [4]. In this protocol, the sessions in the
data set are split randomly into a training set and a test set.
In our case, we use 80% of the sessions for training and 20%
for testing. The training set is used to generate the
recommendation model (similarity matrix or association rules).
For each session in the test set we randomly delete one pair
<id, item>, referred to as the hidden item. The remaining pairs
represent the set of observables, O, based on which the
recommendation is made.
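
The protocol can be sketched in Python as follows. This is only
an illustration under stated assumptions: the function name, the
fixed random seed and the choice to skip test sessions with
fewer than two items (which would leave no observables) are not
taken from the paper.

```python
import random


def all_but_one_split(sessions, test_fraction=0.2, seed=0):
    """Split sessions into train and test and hide one item per test session.

    sessions: dict session_id -> set of items
    Returns (train, test), where train maps id -> set of items and
    test maps id -> (observables, hidden_item).
    """
    rng = random.Random(seed)
    ids = sorted(sessions)
    rng.shuffle(ids)
    n_test = int(round(len(ids) * test_fraction))
    test_ids, train_ids = ids[:n_test], ids[n_test:]

    train = {sid: set(sessions[sid]) for sid in train_ids}
    test = {}
    for sid in test_ids:
        items = sorted(sessions[sid])
        if len(items) < 2:
            # Assumption: a session needs at least one observable item.
            continue
        hidden = rng.choice(items)
        test[sid] = (set(items) - {hidden}, hidden)
    return train, test
```

The recommendation model is then built from `train` only, and
each test session supplies its observables as input and its
hidden item as the single relevant answer.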

The model is evaluated by comparing, for each session in the
test set, the set of recommendations it makes (Rec), given the
set of observables, O, against the hidden item. The set of
recommendations rec_1, rec_2, ..., rec_N for a given user id is
represented as {<id, rec_1>, <id, rec_2>, ..., <id, rec_N>},
where N is the number of recommendations produced by the model.
Based on the set of recommendations and the hidden item for all
the sessions in the test set, we measure Recall, Precision and
the F1 metric [13][12]:

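$$Recall = \frac{|\text{Hit set}|}{|\text{Hidden set}|}, \qquad Precision = \frac{|\text{Hit set}|}{|\text{Recommended set}|},$$

$$F1 = \frac{2 \times Recall \times Precision}{Recall + Precision}.$$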

Recall corresponds to the proportion of relevant
recommendations. Precision gives us the quality of each
individual recommendation. F1 is a measure that combines Recall
and Precision with an equal weight. It ranges from 0 to 1 and
higher values indicate better recommendations. Global recall,
precision and F1 are obtained by averaging individual test user
values.
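
As a worked illustration, the Python sketch below computes the
three measures for one test session and averages them over all
test sessions. It assumes that Recall is the fraction of hidden
items that were recommended and Precision the fraction of
recommendations that hit the hidden item; with one hidden item
per session, the number of hits is either 0 or 1. The function
names are hypothetical.

```python
def session_metrics(recommended, hidden):
    """Recall, precision and F1 for one test session with one hidden item."""
    hits = 1 if hidden in recommended else 0
    recall = hits / 1  # exactly one hidden item per session
    precision = hits / len(recommended) if recommended else 0.0
    if recall + precision == 0:
        return recall, precision, 0.0
    f1 = 2 * recall * precision / (recall + precision)
    return recall, precision, f1


def global_metrics(results):
    """Average the per-session measures.

    results: list of (recommended_items, hidden_item), one per test session.
    Returns (recall, precision, f1).
    """
    per_session = [session_metrics(rec, hidden) for rec, hidden in results]
    n = len(per_session)
    return tuple(sum(m[i] for m in per_session) / n for i in range(3))
```

For instance, a session with two recommendations, one of which
is the hidden item, has Recall 1, Precision 0.5 and F1 of 2/3.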

For the recommendation models based on association rules 
(AR), we chose a minimum support value trying to keep 
at least 50% of the items of the data sets for building the 
models. The minimum confidence values were defined as 
being the support value of the third most frequent item. 

4.2 Single Dimension 

Here we compare the results of the two algorithms using the
traditional model (user × item) and with DaVI, applied
separately to each contextual dimension presented in Table 1.
The charts in Figure 1 plot the F1 measure.

4.2.1 Item-Based Collaborative Filtering 

Our results show that DaVI improves item-based CF predictive
performance when there is a rich contextual dimension. We can
observe this with the dimension band in the Listener and
Playlist data sets (Figure 1 (a) and (b)). In Figure 1 (a) the
dimension band yields a maximal F1 value of 0.31 (top 1). This
value represents an average F1 gain of 34% compared to the
value of F1 without applying DaVI. In Figure 1 (b) band
provides a maximal value of 0.43 (top 1). This value represents
a gain of 24%. In the Entree data set (Figure 1 (c)), the only
appreciable gain using contextual information was obtained with
the dimension intention, with an F1 value of 0.22 (top 1). This
value represents a small gain of 5%, which means that the CF
model benefited less from intention than from band, used in the
former data sets. Additionally, we can see that the highest
gains were obtained using dimensions collected from databases
of the Web sites. The gains using dimensions pre-processed from
Web access logs were very small, around 1.3% on average, which
suggests that the context inferred from Web access logs is not
so rich in information.

Figure 1: F1 metric for Listener, Playlist and Entree data
sets. Panels: (a) CF in Listener data set; (b) CF in Playlist
data set; (c) CF in Entree data set; (d) AR in Listener data
set; (e) AR in Playlist data set; (f) AR in Entree data set.

4.2.2 Association Rules 

Considering the association rules (AR) technique, our results
also show that DaVI improves the accuracy of the recommendation
models. In Figure 1 (d), we have a maximum F1 value of 0.21
(top 1 with band). This value represents a gain of 14.5%
compared to the value of F1 without any contextual information.
In Figure 1 (e), the F1 measure for the dimension band and the
top 2 has the maximal F1 gain, 23.5%. With respect to the
Entree data set (Figure 1 (f)) we have a maximal value of 0.34
(top 1 with intention). This value represents a gain of 9.6%.
An interesting fact here is that, contrary to the other data
sets, Entree presented higher F1 values with the association
rules technique than with item-based collaborative filtering.

4.3 Multiple Dimensions 

So far, we applied DaVI to one contextual dimension at a time.
However, it may be applied to several dimensions. We consider
two different scenarios. The first one (called all together)
simply applies the method to all dimensions presented in
Table 1 simultaneously. The second scenario (called forward
selection) uses a sequential forward selection algorithm [9],
on the training data set, to select the best combination of
dimensions that will be used to make recommendations. The
algorithm starts from an empty set and sequentially adds the
dimension d that results in the highest objective function
$F(D + d)$ on a validation data set, when combined with the
dimensions D that have already been selected.
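
A generic sequential forward selection loop can be sketched in
Python as below. The objective is passed in as a function,
standing in for the F1 value measured on a validation set. The
stopping rule (stop as soon as no remaining dimension improves
the objective) is an assumption, since the stopping criterion is
not stated here, and the gains in the usage example are made-up
numbers for illustration only.

```python
def forward_selection(dimensions, objective):
    """Greedy sequential forward selection.

    dimensions: iterable of candidate contextual dimensions
    objective:  function mapping a list of dimensions D to a score F(D)
    Returns (selected_dimensions, best_score).
    """
    selected = []
    best_score = objective(selected)
    remaining = list(dimensions)
    while remaining:
        # Evaluate F(D + d) for every dimension d not yet selected.
        scored = [(objective(selected + [d]), d) for d in remaining]
        score, best_d = max(scored, key=lambda pair: pair[0])
        if score <= best_score:
            break  # assumed stopping rule: no candidate improves F
        selected.append(best_d)
        remaining.remove(best_d)
        best_score = score
    return selected, best_score


# Hypothetical additive objective, for illustration only.
toy_gains = {"band": 0.08, "music_genre": 0.03, "day": -0.01}


def toy_objective(dims):
    return 0.23 + sum(toy_gains[d] for d in dims)
```

With the toy objective, calling
forward_selection(toy_gains, toy_objective) selects band first,
then music_genre, and stops before adding day, which would lower
the score.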

Given that there are no other methods that combine several
contextual dimensions for Top-N recommendation, we compared our
method to an adaptation of the Combined Reduction approach [1]
for this task. Briefly, this approach uses the values of the
context/dimension as labels for segmenting Web accesses and was
originally developed for rating estimation. It consists of the
following two phases. First, using the training data, a
recommendation method is run for each contextual segment (e.g.,
accesses on Mondays would be a segment) to determine which ones
outperform the traditional model (using only user-item
information). Second, taking into account the context of the
active session, we choose the best contextual model to make the
recommendation. Here the best model is the one which has the
highest F1 value. We have adapted this approach for the Top-N
recommender algorithms presented in Sections 3.1 and 3.2.
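
The two phases can be sketched in Python as follows, assuming
the F1 value of each segment model and of the user-item baseline
has already been measured. The function names, the segment
labels and the fallback label "user x item" are hypothetical,
and this is only an outline of the adaptation described above,
not the original Combined Reduction code.

```python
def select_segment_models(segment_f1, baseline_f1):
    """Phase 1: keep the segments whose model beats the baseline.

    segment_f1:  dict segment label -> F1 of the model built on that segment
    baseline_f1: F1 of the traditional user-item model
    """
    return {seg: f1 for seg, f1 in segment_f1.items() if f1 > baseline_f1}


def choose_model(active_segments, kept, default="user x item"):
    """Phase 2: pick the model used for an active session.

    active_segments: segment labels that match the active session's context
    kept:            output of select_segment_models
    Returns the matching kept segment with the highest F1, or the
    default (traditional) model when no kept segment matches.
    """
    candidates = [seg for seg in active_segments if seg in kept]
    if not candidates:
        return default
    return max(candidates, key=lambda seg: kept[seg])
```

Falling back to the traditional model when no contextual segment
outperforms it means the combined approach uses a contextual
model only where phase 1 found an improvement.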

As baselines, we have used the traditional user-item ap- 
proach and also the results of the best individual dimension 
(called best context), according to the previous experiments. 

In Table 5 the results for N = 1 show that DaVI, using the best
dimension, has F1 values equal to or higher than the other DaVI
scenarios. The only exception is the Listener data set with the
CF model, where the best is DaVI using all dimensions. The
Combined Reduction approach has values equal to and better than
DaVI (best context), respectively, in the Listener and Playlist
data sets with the AR model. In Table 5 the symbol "-" means
that the algorithm timed out.

Table 5: F1 measure for Top-1 recommendations

CF
Methods                    Listener      Playlist      Entree
user × item                0.230         0.351         0.21[illegible]
DaVI (best context)        0.315         0.434         0.225
DaVI (forward selection)   0.311         0.416         0.210
DaVI (all together)        0.317         0.429         0.225
Combined Reduction         0.225         0.351         0.212

AR
Methods                    Listener      Playlist      Entree
user × item                [illegible]   [illegible]   0.315
DaVI (best context)        0.213         0.270         0.341
DaVI (forward selection)   0.203         0.261         0.341
DaVI (all together)        -             0.268         0.336
Combined Reduction         0.213         0.280         0.309



5. CONCLUSIONS AND FUTURE WORK 

In this paper we presented a direct approach, called DaVI, that
enables existing recommender systems to take advantage of
contextual information as virtual items. We discussed the
results obtained using two recommendation techniques,
item-based collaborative filtering and association rules. Using
DaVI with rich contextual information has revealed great
potential to improve the accuracy of recommender systems.
However, identifying rich contextual dimensions is not an easy
task.

We have also compared different settings using the DaVI 
approach (best dimension, forward selection and all dimen- 
sions) with a more sophisticated Combined Reduction ap- 
proach. Next, we will improve this empirical study and pro- 
pose a method to identify rich contextual information from 
Web sites that can be used with DaVI. 